MOSAIC: Agglomerative Clustering with Gabriel Graphs
Authors
Abstract
Representative-based clustering algorithms are popular because of their relatively high speed and their sound theoretical foundation. However, the clusters they can obtain are limited to convex shapes, and their results are highly sensitive to initialization. This paper proposes MOSAIC, a novel agglomerative clustering algorithm that greedily merges neighboring clusters to maximize a given fitness function. MOSAIC uses Gabriel graphs to determine which clusters are neighbors, and it approximates non-convex shapes as unions of small clusters computed by a representative-based clustering algorithm. We evaluate MOSAIC for traditional unsupervised clustering with k-means and DBSCAN, and also for supervised clustering. The experimental results show that this technique produces clusters of higher quality than running a representative-based clustering algorithm on its own. Given a suitable fitness function, MOSAIC can detect clusters of arbitrary shape that are comparable to those generated by DBSCAN. In addition, MOSAIC is capable of dealing with high-dimensional data. We also claim that MOSAIC can be employed as an effective post-processing algorithm to further improve the quality of a clustering.
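The two ingredients named in the abstract, Gabriel-graph neighborhoods and greedy fitness-driven merging, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `gabriel_edges` and `greedy_merge`, the brute-force neighbor test, and the stopping rule (stop as soon as no merge improves the fitness) are all assumptions made for illustration. The sketch uses the standard definition of a Gabriel graph: two points are neighbors if no third point lies strictly inside the ball whose diameter is the segment joining them.

```python
from itertools import combinations


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def gabriel_edges(points):
    """Return the Gabriel-graph edges (i, j), i < j, over the given points.

    Brute-force check: (i, j) is an edge unless some third point k lies
    strictly inside the ball with diameter i-j, i.e. unless
    d(i,k)^2 + d(j,k)^2 < d(i,j)^2.
    """
    n = len(points)
    edges = set()
    for i, j in combinations(range(n), 2):
        d_ij = sq_dist(points[i], points[j])
        blocked = any(
            sq_dist(points[i], points[k]) + sq_dist(points[j], points[k]) < d_ij
            for k in range(n)
            if k != i and k != j
        )
        if not blocked:
            edges.add((i, j))
    return edges


def greedy_merge(labels, edges, fitness):
    """Greedily merge neighboring clusters while the fitness improves.

    labels  : cluster id per data point (ids index the representatives)
    edges   : set of (a, b) pairs of neighboring cluster ids
    fitness : hypothetical callable scoring a label list (higher is better)
    Stopping at the first non-improving step is an assumption of this sketch.
    """
    labels = list(labels)
    edges = set(edges)
    best = fitness(labels)
    while edges:
        scored = []
        for a, b in sorted(edges):
            # Candidate clustering in which cluster b is absorbed into a.
            merged = [a if lab == b else lab for lab in labels]
            scored.append((fitness(merged), a, b, merged))
        score, a, b, merged = max(scored, key=lambda c: c[0])
        if score <= best:
            break
        labels, best = merged, score
        # Redirect edges of the absorbed cluster b to a; drop self-loops.
        relabeled = set()
        for u, v in edges:
            u, v = (a if u == b else u), (a if v == b else v)
            if u != v:
                relabeled.add((min(u, v), max(u, v)))
        edges = relabeled
    return labels
```

In use, `points` would be the representatives (for example, centroids) returned by a representative-based algorithm, `labels` the cluster assignment of each data point, and `fitness` whichever cluster-quality measure is being maximized. The brute-force neighbor test is cubic in the number of representatives, which is acceptable for a sketch because that number is typically far smaller than the number of data points.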
Similar resources
MOSAIC: A Proximity Graph Approach for Agglomerative Clustering
Representative-based clustering algorithms are popular because of their relatively high speed and their sound theoretical foundation. However, the clusters they can obtain are limited to convex shapes, and their results are highly sensitive to initialization. In this paper, a novel agglomerative clustering algorithm called MOSAIC is proposed which greedily merges ...
2 Review of Agglomerative Hierarchical Clustering Algorithms
Hierarchical methods are a well-known clustering technique that can be very useful for various data mining tasks. A hierarchical clustering scheme produces a sequence of clusterings in which each clustering is nested into the next clustering in the sequence. Since hierarchical clustering is a greedy search algorithm based on a local search, the merging decision made early in the agglo...
Multilevel Refinement for Hierarchical Clustering
Hierarchical methods are a well-known clustering technique that can be very useful for various data mining tasks. A hierarchical clustering scheme produces a sequence of clusterings in which each clustering is nested into the next clustering in the sequence. Since hierarchical clustering is a greedy search algorithm based on a local search, the merging decision made early in the agglo...
A Survey on Efficient Clustering Methods with Effective Pruning Techniques for Probabilistic Graphs
This paper provides a survey of K-NN queries, DCR queries, agglomerative complete-linkage clustering, an extension of the edit-distance-based definition graph algorithm, and solving decision problems under uncertainty. The existing system gives a starting point for graph agglomeration, which aims to divide information into clusters according to their similarities, and a variety of algorithms are planned for agglomeration gr...